AITopics | vc dimension

Collaborating Authors

vc dimension

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Contradiction Graphs Determine VC Dimension

Campbell, Jesse, Ibaibarriaga, Daniel, Reyzin, Lev

arXiv.org Machine LearningMay-21-2026

The Vapnik-Chervonenkis dimension is the fundamental combinatorial parameter of distribution-free binary classification. Introduced by Vapnik and Chervonenkis in their work on uniform convergence [VC71], and closely connected to the Sauer-Shelah lemma [Sau72, She72], it characterizes classical PAC learnability [Val84, BEHW89, EHKV89]. In particular, finite VC dimension is equivalent to distribution-free learnability. This paper asks whether that finite-versus-infinite VC dichotomy is still visible after replacing a concept class by its contradiction graphs. For a binary class H {0,1}X, the order-m contradiction graph Gm(H) has as vertices the H-realizable labeled samples of length m, with an edge between two samples if they assign opposite labels to some common domain point. Throughout, samples are ordered sequences, and repetitions of domain points are allowed, subject to consistent labeling. We use the contradiction-graph framework introduced by Alon et al. in their graph-theoretic characterization of private learnability [AMSY24]. They ask whether two binary classes can have isomorphic contradiction graphs at every level while one has finite VC dimension and the other has infinite VC dimension.

artificial intelligence, clique, machine learning, (18 more...)

arXiv.org Machine Learning

2605.20434

Country: North America > United States > Illinois > Cook County > Chicago (0.40)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

What is Learnable in Valiant's Theory of the Learnable?

Hanneke, Steve, Mehrotra, Anay, Velegkas, Grigoris, Zampetakis, Manolis

arXiv.org Machine LearningMay-14-2026

Valiant's 1984 paper is widely credited with introducing the PAC learning model, but it, in fact, introduced a different model: unlike PAC learning, the learner receives only positives, may issue membership queries, and must output a hypothesis with no false positives. Prior work characterized variants, including the case without queries. We revisit Valiant's original model and ask: *Which classes are learnable in it?* For every finite domain, including Valiant's Boolean-hypercube setting, we show that a class is learnable if and only if every realizable positive sample can be certified by a poly-size adaptive query-compression scheme. This is a new variant of sample compression where the learner certifies samples via a short interaction with the membership oracle. Our characterization shows that learnability in Valiant's model is strictly sandwiched between learnability in the PAC model and the variant of Valiant's model without membership queries. This is one of the rare cases where introducing membership queries changes the set of learnable classes, and not just the sample or computational complexity. Next, we study the natural extension of the model to arbitrary domains. While we do not obtain an exact characterization, our techniques readily generalize and show that the same strict sandwiching persists. Finally, we show that $d$-dimensional halfspaces, which are not learnable without queries, are learnable with queries: we give a $\mathrm{poly}(d) \tilde{O}(1/ε)$ sample and $\mathrm{poly}(d) \mathrm{polylog}(1/ε)$ query algorithm, and prove that at least $Ω(d)$ samples or queries are necessary. To our knowledge, this is the first algorithm for halfspaces in Valiant's model. Together, these results uncover a surprisingly rich theory behind Valiant's original notion of learnability and introduce ideas that may be of independent interest in learning theory.

artificial intelligence, machine learning, valiant, (17 more...)

arXiv.org Machine Learning

2605.1384

Country: North America > United States (0.27)

Genre: Research Report (1.00)

Industry:

Education (0.45)
Energy (0.45)
Law (0.33)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)

Add feedback

Null Measurability at the Symmetrization Interface in VC Learning

Gupta, Dhruv

arXiv.org Machine LearningApr-29-2026

Recent work revisiting measurability in the fundamental theorem of statistical learning imposes Borel measurability of ghost-gap suprema. We show that, at the one-sided ghost-gap interface actually used by the standard symmetrization proof, this requirement is stronger than necessary. For any Borel-parameterized concept class on a Polish domain, the bad event "there exists a hypothesis whose ghost empirical error exceeds its training empirical error by at least ε/2" is analytic. By Choquet capacitability, it is therefore measurable in the completion of every finite Borel measure. We then construct a concept class whose bad event is null-measurable but not Borel, giving a strict separation from the Borel supremum condition. Finally, we prove closure under patching, fixed and countable interpolation, and fiber-product amalgamation, showing that the weaker regularity level is stable under natural concept-class constructors. In the realizable setting, where targets belong to the class and are measurable, these results weaken the measurability hypothesis needed by the symmetrization route from finite VC dimension to PAC learnability. The main results and the descriptive-set-theoretic infrastructure used by them are formalized in Lean 4.

artificial intelligence, bad event, machine learning, (17 more...)

arXiv.org Machine Learning

2604.25028

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)

Add feedback

2d76b6a9f96181ab717c1a15ab9302e1-Supplemental-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 06:53:18 GMT

artificial intelligence, machine learning, probability, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.68)

Add feedback

On the Recursive Teaching Dimension of VC Classes

Xi Chen, Xi Chen, Yu Cheng, Bo Tang

Neural Information Processing SystemsMar-23-2026, 11:38:44 GMT

The recursive teaching dimension (RTD) of a concept class C {0,1}n, introduced by Zilles et al. [ZLHZ11], is a complexity parameter measured by the worst-case number of labeled examples needed to learn any target concept of C in the recursive teaching model. In this paper, we study the quantitative relation between RTD and the well-known learning complexity measure VC dimension (VCD), and improve the best known upper and (worst-case) lower bounds on the recursive teaching dimension with respect to the VC dimension. Given a concept class C {0,1}n with VCD(C) = d, we first show that RTD(C) is at most d 2d+1. This is the first upper bound for RTD(C)that depends only on VCD(C), independent of the size of the concept class |C| and its domain size n. Before our work, the best known upper bound for RTD(C) is O(d2d loglog|C|), obtained by Moran et al. [MSWY15].

artificial intelligence, dimension, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.28)
Europe (0.28)

Industry: Education (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

Supervised learning through the lens of compression

Ofir David, Shay Moran, Amir Yehudayoff

Neural Information Processing SystemsMar-23-2026, 09:56:17 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, compression scheme, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.40)

Add feedback

Multi-step learning and underlying structure in statistical models

Maia Fraser

Neural Information Processing SystemsMar-23-2026, 06:54:58 GMT

In multi-step learning, where a final learning task is accomplished via a sequence of intermediate learning tasks, the intuition is that successive steps or levels transform the initial data into representations more and more "suited" to the final learning task. A related principle arises in transfer-learning where Baxter (2000) proposed a theoretical framework to study how learning multiple tasks transforms the inductive bias of a learner. The most widespread multi-step learning approach is semisupervised learning with two steps: unsupervised, then supervised. Several authors (Castelli-Cover, 1996; Balcan-Blum, 2005; Niyogi, 2008; Ben-David et al, 2008; Urner et al, 2011) have analyzed SSL, with Balcan-Blum (2005) proposing a version of the PAC learning framework augmented by a "compatibility function" to link concept class and unlabeled data distribution. We propose to analyze SSL and other multi-step learning approaches, much in the spirit of Baxter's framework, by defining a learning problem generatively as a joint statistical model on X Y.

artificial intelligence, learning, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Industry: Education (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Transformation-Invariant Learning and Theoretical Guarantees for OOD Generalization

Neural Information Processing SystemsMar-22-2026, 10:12:45 GMT

Learning with identical train and test distributions has been extensively investigated both practically and theoretically. Much remains to be understood, however, in statistical learning under distribution shifts. This paper focuses on a distribution shift setting where train and test distributions can be related by classes of (data) transformation maps. We initiate a theoretical study for this framework, investigating learning scenarios where the target class of transformations is either known or unknown. We establish learning rules and algorithmic reductions to Empirical Risk Minimization (ERM), accompanied with learning guarantees. We obtain upper bounds on the sample complexity in terms of the VC dimension of the class composing predictors with transformations, which we show in many cases is not much larger than the VC dimension of the class of predictors. We highlight that the learning rules we derive offer a game-theoretic viewpoint on distribution shift: a learner searching for predictors and an adversary searching for transformation maps to respectively minimize and maximize the worst-case loss.

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

On the Recursive Teaching Dimension of VC Classes

Neural Information Processing SystemsMar-17-2026, 08:55:32 GMT

The recursive teaching dimension (RTD) of a concept class $C \subseteq \{0, 1\}^n$, introduced by Zilles et al. [ZLHZ11], is a complexity parameter measured by the worst-case number of labeled examples needed to learn any target concept of $C$ in the recursive teaching model. In this paper, we study the quantitative relation between RTD and the well-known learning complexity measure VC dimension (VCD), and improve the best known upper and (worst-case) lower bounds on the recursive teaching dimension with respect to the VC dimension. Given a concept class $C \subseteq \{0, 1\}^n$ with $VCD(C) = d$, we first show that $RTD(C)$ is at most $d 2^{d+1}$. This is the first upper bound for $RTD(C)$ that depends only on $VCD(C)$, independent of the size of the concept class $|C|$ and its~domain size $n$.

artificial intelligence, machine learning, proceedings, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.97)

Add feedback

Filters

Collaborating Authors

vc dimension

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Contradiction Graphs Determine VC Dimension

What is Learnable in Valiant's Theory of the Learnable?

Null Measurability at the Symmetrization Interface in VC Learning

1967f962c7c2083618236d80eeb9d1ac-Paper-Conference.pdf

2d76b6a9f96181ab717c1a15ab9302e1-Supplemental-Conference.pdf

On the Recursive Teaching Dimension of VC Classes

Supervised learning through the lens of compression

Multi-step learning and underlying structure in statistical models

Transformation-Invariant Learning and Theoretical Guarantees for OOD Generalization

On the Recursive Teaching Dimension of VC Classes